Survival Analysis
Survival analysis is a useful tool for analyzing "time until event" data. Most commonly, survival analysis looks at time until death. Other commonly investigated endpoints include disease recurrence, progression and metastasis.

Typically, survival data will be in the following format:

The Status column contains a '1' if the subject was observed to have the event (death, recurrence, progression etc.) and a '0' if the subject was censored. In this context, 'censoring' just means that the subject was not observed to have the event. Censoring most commonly occurs either when the subject can no longer be contacted (loss to follow-up) or when the study ends before the subject has had the event. For example, subject 1's death occurred after 250 days of observation, whereas subject 5 was observed for 87 days and was then censored.

Kaplan-Meier Curves

A good way to visualize survival data is with Kaplan-Meier curves, which show the estimated probability of survival over time. Below is an example of Kaplan-Meier curves looking at survival for two different lung cancer treatments:

From these curves, it appears that the Experimental treatment has a quicker initial drop-off in survival than the Standard treatment. However, among those who survived the initial drop-off, the Experimental treatment appears to have slightly better survival. Overall, though, the two survival curves are relatively close to each other.

Various point estimates can be read from the graph as well. The median survival time, the point at which the estimated survival probability falls to 50%, seems to be before 100 days for the Experimental group and after 100 days for the Standard treatment.

We can then look at the Kaplan-Meier curves with 95% confidence bands:

This graph better shows the uncertainty in the survival estimates. As we can see, there is quite a bit of overlap between the confidence bands.
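
As a rough illustration of how a Kaplan-Meier curve is built from time and status data, the sketch below implements the product-limit estimator in plain Python. At each observed event time, the running survival estimate is multiplied by the fraction of at-risk subjects who did not have the event. The function names `kaplan_meier` and `median_survival` are chosen here for illustration. The data are hypothetical, apart from the two observations mentioned in the text (250 days with the event, 87 days censored). Confidence bands are not computed.

```python
from collections import Counter


def kaplan_meier(times, status):
    """Return [(event_time, estimated_survival), ...] (product-limit estimator).

    times  -- observation time for each subject (e.g. in days)
    status -- 1 if the event was observed at that time, 0 if censored
    """
    # Number of observed events at each distinct event time.
    events = Counter(t for t, s in zip(times, status) if s == 1)
    survival = 1.0
    curve = []
    for t in sorted(events):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - events[t] / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First event time at which estimated survival is <= 50%, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical data: only the 250-day event and the 87-day censored
# observation come from the text; the other values are made up.
times = [250, 120, 60, 30, 87]
status = [1, 1, 1, 0, 0]

curve = kaplan_meier(times, status)
print(curve)                   # [(60, 0.75), (120, 0.375), (250, 0.0)]
print(median_survival(curve))  # 120
```

Censored subjects never cause a drop in the curve, but they do count in the at-risk total until their censoring time. This is how partial follow-up still contributes information.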
The overlap between the confidence bands suggests that we do not have enough evidence to conclude that those on the Standard treatment had significantly different survival outcomes from those on the Experimental treatment.

We can also do a formal test of whether the two curves differ significantly from each other. This is called the log-rank test. For these two curves, the p-value is 0.93. This is consistent with what we saw visually: the curves do not appear to differ much. For a significant difference to be observed, the curves would generally need to be well separated from each other across the time span.

Cox Proportional-Hazards Model

Kaplan-Meier curves are useful when only one categorical predictor is of interest. To incorporate continuous factors (age, weight, etc.) and multiple categorical predictors, the Cox Proportional-Hazards model can be used. The output from this model is similar to that of logistic regression, but instead of an odds ratio, we get a hazard ratio (HR). The hazard function is the instantaneous rate of experiencing the event (death, recurrence etc.) at a given time, among subjects who have not yet experienced it. Like an odds ratio, a hazard ratio greater than 1 indicates increased hazard, whereas a hazard ratio less than 1 indicates decreased hazard. A more in-depth example can be found here.

Example

The following data come from a different source, but also concern factors affecting the survival of lung cancer patients. We will fit a Cox Proportional-Hazards model that looks at the effects of age, sex, and ECOG score on survival. The ECOG score measures physical ability and ranges from 0 to 5, where 0 indicates no activity restrictions and 5 indicates death.

Model results:

The important column to look at is the exp(coef) column, which gives the hazard ratios (the coef column gives the log of the hazard ratio). The column labeled Pr(>|z|) is the p-value. We can see that sex (female) and ECOG score were significantly associated with time until death.
Being female was associated with a lower hazard, almost half that of men, while a higher ECOG score was associated with a greater hazard. For ECOG score this makes intuitive sense, since a higher ECOG score indicates greater physical impairment.
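
To make the link between the coef and exp(coef) columns concrete, here is a minimal sketch of converting a Cox model coefficient into a hazard ratio with an approximate 95% Wald confidence interval. In the Cox model the hazard is a baseline hazard multiplied by exp(coef × covariate), so exp(coef) is the hazard ratio for a one-unit increase in that covariate. The helper name `hazard_ratio` and the numeric inputs are hypothetical; they are not taken from the model results discussed above.

```python
import math


def hazard_ratio(coef, se, z=1.96):
    """Convert a Cox log-hazard coefficient into (HR, lower, upper).

    coef -- estimated coefficient (log of the hazard ratio)
    se   -- standard error of the coefficient
    z    -- normal quantile; 1.96 gives an approximate 95% Wald interval
    """
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


# Hypothetical coefficient and standard error, for illustration only.
hr, lower, upper = hazard_ratio(coef=-0.6, se=0.2)
print(f"HR = {hr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A negative coefficient gives a hazard ratio below 1 (decreased hazard) and a positive coefficient gives one above 1 (increased hazard). An interval that excludes 1 corresponds roughly to a p-value below 0.05.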